Abstract

Network  models  are  widely  used  to  represent  relational  information  among  interacting  units. 
In  studies  of  social  networks,  recent  emphasis  has  been  placed  on  random  graph  models  where 
the  nodes  usually  represent  individual  social  actors  and  the  edges  represent  the  presence  of 
a  specified  relation  between  actors.  We  develop  a  class  of  models  where  the  probability  of 
a  relation  between  actors  depends  on  the  positions  of  individuals  in  an  unobserved  “social 
space.”  Inference  for  the  social  space  is  developed  within  a  maximum  likelihood  and  Bayesian 
framework,  and  Markov  chain  Monte  Carlo  procedures  are  proposed  for  making  inference  on 
latent  positions  and  the  effects  of  observed  covariates.  We  present  analyses  of  three  standard 
datasets  from  the  social  networks  literature,  and  compare  the  method  to  an  alternative 
stochastic  blockmodeling  approach.  In  addition  to  improving  upon  model  fit,  our  method 
provides  a  visual  and  interpretable  model-based  spatial  representation  of  social  relationships, 
and  improves  upon  existing  methods  by  allowing  the  statistical  uncertainty  in  the  social  space 
to  be  quantified  and  graphically  represented. 

KEY  WORDS:  Network  data;  latent  position  model;  conditional  independence  model. 


1  Introduction 


Social network data typically consist of a set of n actors and a relational tie $y_{i,j}$, measured
on each ordered pair of actors i, j = 1, ..., n. This framework has many applications in the
social and behavioral sciences including, for example, the behavior of epidemics, the
interconnectedness of the World Wide Web, and telephone calling patterns. Quantitative research
on social networks has a long history going back at least to Moreno (1934). The development
of log-linear statistical models by Holland and Leinhardt (1977, 1981), Fienberg, Meyer, and
Wasserman (1985), Wang and Wong (1987), and others represents a major advance.

In the simplest cases, $y_{i,j}$ is a dichotomous variable, indicating the presence or absence
of some relation of interest, such as friendship, collaboration, transmission of information
or disease, etc. The data are often represented by an n x n sociomatrix Y. In the case of
binary relations, the data can also be thought of as a graph in which the nodes are actors
and the edge set is $\{(i,j) : y_{i,j} = 1\}$. When $(i,j)$ is in the edge set we write $i \to j$. If ties
are undirected, in that $y_{i,j} = y_{j,i}$ for all $i \neq j$ by logical necessity, we write $i \sim j$ if $y_{i,j} = 1$.
However, even in the case of directed relations, ties often tend to be reciprocal ($y_{i,j} = y_{j,i}$
with high probability) and transitive ($i \to j,\ j \to k \Rightarrow i \to k$ with high probability). As such,
probabilistic models of network relations have typically allowed for some sort of dependence
between ties. For example, the $p_1$ model of Holland and Leinhardt (1981) includes parameters
for the propensity of ties to be reciprocal, as well as parameters for the number of ties and
individual tendencies to give or receive ties. However, these models are restrictive as they
assume the $\binom{n}{2}$ dyads $(y_{i,j}, y_{j,i})$ to be independent.

Frank  and  Strauss  (1986)  characterized  the  exponential  family  of  random  graph  models 
by  elaborating  work  of  Besag  (1974)  developed  in  the  context  of  spatial  statistics.  These 
have  been  referred  to  as  the  “p*  ”  class  of  models  in  the  psychology  and  sociology  literatures 
(Wasserman  and  Pattison,  1996).  Given  their  general  nature  and  applicability,  we  shall  refer 
to  them  simply  as  (exponentially  parametrized)  random  graph  models.  Frank  and  Strauss 
(1986)  also  proposed  models  with  Markov  structure  that  allow  for  forms  of  dyad  dependence, 
often  referred  to  as  homogeneous  monadic  Markov  models.  Recent  work  of  Corander  et  al. 
(1998),  Crouch,  Wasserman  and  Trachtenberg  (1998),  Besag  (2000),  Handcock  (2000)  and 
Snijders  (2001)  has  developed  likelihood-based  inference  for  these  models  based  on  Markov 
Chain  Monte  Carlo  algorithms.  Approximate  maximum  likelihood  approaches  had  been 
developed  by  Frank  and  Strauss  (1986),  Strauss  and  Ikeda  (1990),  and  Wasserman  and 
Pattison (1996). However, the statistical properties of these “pseudolikelihood” estimators
are  only  partially  understood. 

Recent works have explored the properties of homogeneous monadic Markov models. Results
in Besag (2000) and Handcock (2000) suggest that commonly used models are more
global than local in structure and this contributes to model degeneracy and instability problems
(Ruelle 1968). These issues are not resolved by alternative forms of estimation but
represent defects in the models themselves, at least to the extent that they are useful for
modeling realistic social networks. These factors have motivated the development of alternative
models without these restrictions.

For  networks  in  which  actors  belong  to  prespecified  groups,  Wang  and  Wong  (1987) 
developed a stochastic blockmodel, an extension of the $p_1$ model, which includes parameters
describing  differential  rates  of  between-group  and  within-group  ties.  For  cases  in  which  group 
membership  is  not  observed,  Nowicki  and  Snijders  (2001)  presented  a  model  in  which  the  ties 
in  a  social  network  are  conditionally  independent,  given  the  latent  class  membership  of  each 
actor. In such a model, actors within a latent class are treated as stochastically equivalent;
that is, the events $(i_1 \to j_1)$ and $(i_2 \to j_2)$ have the same probability if actors $i_1$ and $j_1$ are in
the same respective latent classes as $i_2$ and $j_2$. Such a model may prove useful in identifying
clusters  of  individuals  for  whom  stochastic  equivalence  holds,  that  is,  clusters  of  individuals 
who  relate  to  all  other  actors  in  the  system  in  a  similar  way.  However,  models  based  on 
distinct  clusters  may  not  fit  well  when  many  actors  fall  between  clusters,  or  when  relations 
are  transitive  yet  there  is  no  strong  clustering. 

In  some  social  network  data,  the  probability  of  a  relational  tie  between  two  individuals 
may increase as the characteristics of the individuals become more similar. A subset of
individuals in the population with a large number of social ties between them may be indicative
of  a  group  of  individuals  who  have  nearby  positions  in  this  space  of  characteristics,  or  “social 
space.”  Note  that  if  some  of  the  characteristics  are  unobserved,  then  a  probability  measure 
over  these  unobserved  characteristics  induces  a  model  in  which  the  presence  of  a  tie  between 
two  individuals  is  dependent  on  the  presence  of  other  ties.  Relations  modeled  as  such  are 
probabilistically transitive in nature: the observation of $i \to j$ and $j \to k$ suggests that i and
k  are  not  too  far  apart  in  social  space,  and  therefore  are  more  likely  to  have  a  tie.  In  Section 
2,  we  develop  a  latent  variable  model  for  such  transitive  relations,  where  it  is  assumed  each 
actor  i  has  an  unknown  position  Zi  in  social  space.  The  ties  in  the  network  are  assumed  to  be 
conditionally  independent  given  these  positions,  and  the  probability  of  a  specific  tie  between 
two  individuals  is  modeled  as  some  function  of  their  positions,  such  as  the  distance  between 
the  two  actors  in  social  space.  Estimation  of  positions  is  simplified  by  the  use  of  a  logistic 
regression  model,  and  confidence  regions  for  latent  positions  are  computable  using  standard 
MCMC  algorithms,  as  described  in  Section  3.  In  Section  4,  these  latent-space  models  are  fit 
to  a  number  of  standard  datasets,  and  their  performance  in  terms  of  model  fit  is  compared  to 
alternative stochastic blockmodels. In addition to improving upon model fit, the results from
our  approach  are  relatively  easy  to  interpret,  and  modeling  the  positions  as  belonging  to  a 
low-dimensional  Euclidean  space  provides  a  model-based  means  of  graphically  representing 
social  network  data. 

2  Latent  Position  Methods 

The data we model in this paper consist of an n x n sociomatrix Y, with entries $y_{i,j}$ denoting
the value of the relation from actor i to actor j, and possibly additional covariate information
X. We focus on binary-valued relations, although the methods in this paper can be extended
to  more  general  relational  data  using  ideas  from  generalized  linear  models.  Both  directed 
and  undirected  relations  can  be  analyzed  with  our  methods,  although  the  features  of  the 
model  are  slightly  different  in  the  two  cases,  as  described  below. 

We  take  a  conditional  independence  approach  to  modeling  by  assuming  that  the  presence 
or  absence  of  a  tie  between  two  individuals  is  independent  of  all  other  ties  in  the  system, 
given  the  unobserved  positions  in  social  space  of  the  two  individuals: 

$$P(Y \mid Z, X, \theta) = \prod_{i \neq j} P(y_{i,j} \mid z_i, z_j, x_{i,j}, \theta),$$

where X and $x_{i,j}$ are observed characteristics which are potentially pair-specific and
vector-valued, and $\theta$ and Z are parameters and positions to be estimated.

2.1  Distance  Models 

A convenient parametrization of $P(y_{i,j} \mid z_i, z_j, x_{i,j}, \theta)$ is the logistic regression model in which
the probability of a tie depends on the Euclidean distance between $z_i$ and $z_j$, as well as on
covariates $x_{i,j}$ that measure characteristics of the dyad:

$$\eta_{i,j} = \operatorname{log\,odds}(y_{i,j} = 1 \mid z_i, z_j, x_{i,j}, \alpha, \beta) = \alpha + \beta' x_{i,j} - |z_i - z_j|. \tag{1}$$

This model has a simple interpretation: for two actors j and k equidistant from i, the log
odds ratio of $i \to j$ versus $i \to k$ is $\beta'(x_{i,j} - x_{i,k})$.
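As an illustration of model (1), the following minimal Python sketch evaluates the log odds and tie probability for given positions and checks the interpretation stated above. The function names, positions, and coefficient values are hypothetical and chosen only for the example; they are not taken from the analyses in this paper.

```python
import math

def tie_log_odds(z_i, z_j, alpha, beta=(), x_ij=()):
    """Log odds of a tie under model (1): alpha + beta'x_ij - |z_i - z_j|."""
    covariate_term = sum(b * x for b, x in zip(beta, x_ij))
    return alpha + covariate_term - math.dist(z_i, z_j)

def tie_probability(z_i, z_j, alpha, beta=(), x_ij=()):
    """Inverse logit of the log odds."""
    eta = tie_log_odds(z_i, z_j, alpha, beta, x_ij)
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical positions: j and k are both at distance 1 from i.
z_i, z_j, z_k = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)
alpha, beta = 1.0, (0.5,)
x_ij, x_ik = (1.0,), (0.0,)

# With equal distances, the log odds ratio reduces to beta'(x_ij - x_ik).
log_odds_ratio = (tie_log_odds(z_i, z_j, alpha, beta, x_ij)
                  - tie_log_odds(z_i, z_k, alpha, beta, x_ik))
```

Because the distance enters symmetrically, swapping the two positions leaves the probability unchanged whenever the covariate is the same in both directions.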

Note that the $|z_i - z_j|$'s could be replaced by an arbitrary set of distances $\{d_{i,j}\}$ satisfying
the triangle inequality, $d_{i,j} \le d_{i,k} + d_{k,j}\ \forall \{i, j, k\}$. A semiparametric modeling approach
would impose no further constraints on the distances, and so the parameter space would
include $\binom{n}{2}$ distances to estimate, subject to the inequality constraints. Generally, we prefer
to model the $d_{i,j}$'s as being distances between actors in some low-dimensional Euclidean
space for reasons of parsimony and ease of model interpretability.

The latent position model is inherently reciprocal and transitive: if $i \to j$ and $j \to k$, then
$d_{i,j}$ and $d_{j,k}$ are probably not too large, making more probable the events $j \to i$ (reciprocity)
and $i \to k$ (transitivity). One interesting feature of the model is that it provides an essentially
perfect model fit for many social network datasets with undirected relations, in a parameter
space of much lower dimension than that of the data. To explore this feature further,
consider the following reparametrization of (1) in the case of no covariate information and
an undirected relation $y_{i,j} = y_{j,i}$:

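$$\operatorname{log\,odds}(y_{i,j} = 1 \mid d_{i,j}, x_{i,j}, \alpha) = \alpha(1 - d_{i,j}). \tag{2}$$

We say a set of distances $\{d_{i,j}\}$ represents the network Y if

$$\{d_{i,j} > 1\ \forall\, i, j : y_{i,j} = 0\} \quad \text{and} \quad \{d_{i,j} < 1\ \forall\, i, j : y_{i,j} = 1\}. \tag{3}$$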

For such a set of distances, the probability of the data under parametrization (2) will converge
to unity as $\alpha \to \infty$. As we will be modeling the distances as being Euclidean distances in
some k-dimensional space, we will say a network is $d_k$-representable if there exist points
$z_i \in \mathbb{R}^k$ such that the distances $d_{i,j} = |z_i - z_j|$ satisfy (3). In such a space, $d_k$-representability
is equivalent to being able to find a set of points for the actors such that $i \sim j$ if and only if
i and j lie within k-dimensional unit balls centered around each other.
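The representation condition (3) and the limiting behavior under parametrization (2) can be checked numerically. The sketch below is illustrative only: the three-actor path network and its one-dimensional positions are hypothetical, and the function names are ours.

```python
import math

def represents(positions, Y):
    """Condition (3): d_ij < 1 wherever y_ij = 1 and d_ij > 1 wherever y_ij = 0."""
    n = len(positions)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.dist(positions[i], positions[j])
            if (Y[i][j] == 1 and not d < 1.0) or (Y[i][j] == 0 and not d > 1.0):
                return False
    return True

def log_likelihood(positions, Y, alpha):
    """Log-likelihood under (2) for an undirected network (unordered pairs i < j)."""
    total = 0.0
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            eta = alpha * (1.0 - math.dist(positions[i], positions[j]))
            # Stable evaluation of log(1 + exp(eta)).
            total += eta * Y[i][j] - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return total

# Hypothetical path network 0 ~ 1 ~ 2 with positions on a line.
Y = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
positions = [(0.0,), (0.8,), (1.6,)]

is_representation = represents(positions, Y)
log_liks = [log_likelihood(positions, Y, a) for a in (1.0, 10.0, 100.0)]
```

For a representing set of positions the log-likelihood increases toward zero as alpha grows, which is the convergence to unity described above.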

It is interesting to note that there are many examples of social networks which are
$d_k$-representable for k much smaller than n, and even for k = 2. For example, consider an
n-star  network  composed  of  one  central  actor  having  ties  to  n  —  1  otherwise  unconnected 
actors. Such a network is trivially $d_{n/2}$-representable for any n, by positioning pairs of
non-central  actors  on  either  sides  of  the  central  actor  along  one  of  the  n/2  coordinate  axes. 
As  another  example,  consider  an  n-chain  network,  in  which  there  is  an  ordering  of  n  actors 
so that $1 \sim 2 \sim 3 \sim \cdots \sim n \sim 1$. This network is $d_2$-representable for all n by placing the
actors  equidistant  from  the  origin  but  separated  by  equal  angles.  Such  results  suggest  that 
distance-based  models  may  provide  a  good  method  of  data  reduction  and  presentation  for 
undirected  relational  data.  Although  the  above  examples  may  seem  contrived,  in  Section  4.2 
we  analyze  a  real-life  15  actor  network  which  is  d2-representable. 
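The circular construction for the n-chain can be verified directly. In the sketch below, the particular radius (midway between the value at which neighbors are exactly distance 1 apart and the value at which second neighbors are) is our own choice for illustration; any radius strictly between those two bounds would serve. The sketch assumes n >= 4.

```python
import math

def ring_positions(n):
    """Place n >= 4 actors on a circle at equal angles so that only neighbors are within distance 1."""
    if n < 4:
        raise ValueError("this sketch assumes n >= 4")
    r_low = 1.0 / (2.0 * math.sin(2.0 * math.pi / n))   # second neighbors at distance exactly 1
    r_high = 1.0 / (2.0 * math.sin(math.pi / n))        # neighbors at distance exactly 1
    r = 0.5 * (r_low + r_high)
    return [(r * math.cos(2.0 * math.pi * m / n), r * math.sin(2.0 * math.pi * m / n))
            for m in range(n)]

def ring_is_represented(n):
    """Check condition (3) for the n-chain 1 ~ 2 ~ ... ~ n ~ 1."""
    Z = ring_positions(n)
    for i in range(n):
        for j in range(i + 1, n):
            steps = min(j - i, n - (j - i))   # separation along the ring
            d = math.dist(Z[i], Z[j])
            ok = d < 1.0 if steps == 1 else d > 1.0
            if not ok:
                return False
    return True

checks = [ring_is_represented(n) for n in range(4, 31)]
```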

2.2  Projection  Methods 

The distance model presented above is inherently symmetric, in that $p(i \to j) = p(j \to i)$.
However, in many networks such symmetry is not achieved. For example, perhaps actor i
sends a large number of ties whereas j sends ties to a small subset of the actors receiving
ties from i. In this case, we want to model both that i and j are “similar” but that i is
more “socially active”. Such a model could be achieved by including actor-specific activity
parameters, an approach used by Wang and Wong (1987) to allow for actor-level variability
in their stochastic blockmodel.

Alternatively, variable activity can be modeled parsimoniously in the context of a latent
position model which allows for probabilistic transitivity in the relations, as well as
individual-specific levels of social activity. Suppose each actor i has an associated
unit-length k-dimensional vector of characteristics $v_i$. These characteristics can be thought of as
points on a k-dimensional sphere of unit radius. We might imagine that i and j are prone
to having ties if the angle between them is small, neutral to having ties if the angle is a
right angle, and averse to ties if the angle is obtuse. These three situations correspond to
$v_i' v_j > 0$, $v_i' v_j = 0$, and $v_i' v_j < 0$, respectively. In other words, i and j are more likely to
have a tie if the characteristics of i and j are in the same direction, and less likely to have a
tie if they have characteristics in opposite directions. Adding a parameter for each node to
allow for different levels of activity is equivalent to having latent vectors of various lengths:
letting $a_i > 0$ be the activity level of actor i, we can model the probability of a tie from i to
j as depending on the magnitude of $a_i v_i' v_j$, or equivalently, $z_i' z_j / |z_j|$, where $z_i = a_i v_i$. This
is the signed magnitude of the projection of $z_i$ in the direction of $z_j$, and can be thought
of as the extent to which i and j share characteristics, multiplied by the activity level of i.
For convenience, we will parametrize the probability of a tie from i to j using the logistic
regression model as before:

$$\operatorname{log\,odds}(y_{i,j} = 1 \mid z_i, z_j, x_{i,j}, \alpha, \beta) = \alpha + \beta' x_{i,j} + \frac{z_i' z_j}{|z_j|}.$$

In  some  situations  we  may  wish  to  model  differential  rates  of  accepting  ties.  In  this  case, 
the above probability could depend on the latent vectors through $z_i' z_j / |z_i|$.
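To make the asymmetry of the projection model concrete, the following sketch evaluates the log odds in both directions for two actors who point in the same direction but differ in activity level. The vectors and the value of alpha are hypothetical, and covariates are omitted for brevity.

```python
import math

def projection_log_odds(z_i, z_j, alpha):
    """Log odds of a tie from i to j: alpha + z_i'z_j / |z_j| (no covariates)."""
    dot = sum(a * b for a, b in zip(z_i, z_j))
    return alpha + dot / math.hypot(*z_j)

# Hypothetical latent vectors: same direction, but i is more "socially active" than j.
z_i = (3.0, 0.0)
z_j = (0.5, 0.0)
alpha = -1.0

eta_ij = projection_log_odds(z_i, z_j, alpha)   # tie from i to j
eta_ji = projection_log_odds(z_j, z_i, alpha)   # tie from j to i
```

Here the log odds from i to j equal alpha plus the length of $z_i$, while those from j to i equal alpha plus the length of $z_j$, so the more active actor is more likely to send the tie.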

3  Estimation 

In  contrast  to  the  p*  and  Markov  random  graph  models,  the  log-likelihood  of  a  conditional 
independence  model  is  relatively  simple: 

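$$\log P(Y \mid \eta) = \sum_{i \neq j} \left\{ \eta_{i,j}\, y_{i,j} - \log\left(1 + e^{\eta_{i,j}}\right) \right\}, \tag{4}$$

where $\eta$ is a function of parameters, unknown positions, and perhaps known explanatory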
variables.  As  such,  likelihood-based  estimation  methods,  such  as  maximum-likelihood  and 
Bayesian  inference,  are  feasible. 
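The simplicity of the conditional independence log-likelihood is easy to see in code. The sketch below assumes the standard logistic form, in which each ordered pair contributes $\eta_{i,j} y_{i,j} - \log(1 + e^{\eta_{i,j}})$; the small sociomatrix and the values of $\eta$ are hypothetical.

```python
import math

def log1p_exp(eta):
    """Numerically stable log(1 + exp(eta))."""
    return max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))

def log_likelihood(eta, Y):
    """Sum of eta_ij * y_ij - log(1 + exp(eta_ij)) over ordered pairs i != j."""
    n = len(Y)
    return sum(eta[i][j] * Y[i][j] - log1p_exp(eta[i][j])
               for i in range(n) for j in range(n) if i != j)

# Hypothetical directed network on three actors.
Y = [[0, 1, 0], [1, 0, 0], [0, 1, 0]]

# eta = 0 everywhere assigns probability 1/2 to every ordered pair.
eta_zero = [[0.0] * 3 for _ in range(3)]
ll_zero = log_likelihood(eta_zero, Y)

# An eta that agrees in sign with the data fits better.
eta_good = [[5.0 if Y[i][j] == 1 else -5.0 for j in range(3)] for i in range(3)]
ll_good = log_likelihood(eta_good, Y)
```

Any of the models in Section 2 can be plugged in by computing the matrix of $\eta_{i,j}$ values from the parameters and latent positions before calling the function.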
The log-likelihood (4) is strictly concave in the matrix $\eta = \{\eta_{i,j}\}$. Consider first the
semiparametric model $\eta = \alpha \mathbf{1}\mathbf{1}' - D$, where D is constrained only to be a positive symmetric
matrix of values satisfying the triangle inequality. As the parameter space $\{\alpha, D\}$ is convex
and $\eta(\alpha, D)$ is affine, there is a unique value of $\alpha \mathbf{1}\mathbf{1}' - D$ maximizing the likelihood (note,
however, that $\alpha$ is confounded with D, as adding a positive constant to a set of distances
yields another set of distances). Unfortunately, the log-likelihood is not generally concave in $\{\alpha, Z\}$
for either the distance model or the projection model, as the function $\eta = \eta(\alpha, Z)$ is not
affine. This makes identification of a global MLE problematic. However, one approach is
to first identify a set of distances, not necessarily Euclidean, which maximize the likelihood
(a convex minimization problem). A set of positions in $\mathbb{R}^k$ approximating the distances can
then be found using multidimensional scaling methods. This set of positions can be
used as a starting point in a non-linear optimization routine. A simpler approach which
works well in the examples in this paper is to obtain a set of dissimilarities between nodes
based on an ad hoc measure, such as the Euclidean distances between rows or columns of
the sociomatrix, or the geodesic distance (path length) between the nodes (Wasserman and
Faust 1994). Starting values for the positions can then be found using multidimensional
scaling.

Distances between a set of points in Euclidean space are invariant under rotation,
reflection, and translation. Therefore, for each k x n matrix of latent positions Z there
is an infinite number of other positions giving the same log-likelihood. More specifically,
$\log \Pr(Y \mid Z, \alpha) = \log \Pr(Y \mid Z^*, \alpha)$ for any $Z^*$ which is equal to Z under the operations of
reflection, rotation, or translation. A confidence region which includes two equivalent positions
$Z_1$ and $Z_2$ is in a sense overestimating the variability in the unknown positions (although
not overestimating the variability in distances or relative positions, as these are identical for
$Z_1$ and $Z_2$). Fortunately, this problem can be resolved by basing inference on equivalence
classes of latent positions: let [Z] be the class of positions equivalent to Z under rotation,
reflection, and translation. For each [Z], there is one set of distances between the nodes. We
call this class of positions a configuration.

We make inference on configurations via inference on particular elements of configurations
which are comparable across configurations. For a given configuration [Z], we select for
inference $Z^* = \arg\min_{TZ} \operatorname{tr}(Z_0 - TZ)'(Z_0 - TZ)$, where $Z_0$ is a fixed set of positions and
T ranges over the set of rotations, reflections, and translations. $Z^*$ is a “Procrustean”
transformation of Z, being the element of [Z] closest to $Z_0$ in terms of the sum of squared
positional differences, and is unique if $Z_0 Z'$ is nonsingular (Sibson, 1979). $Z^*$ is relatively
easy to compute: assuming Z and $Z_0$ are both centered at the origin, $Z^*$ is given by
$Z^* = Z_0 Z' (Z Z_0' Z_0 Z')^{-1/2} Z$. We will typically take $Z_0 = \hat{Z}$, an MLE of the latent positions centered
at the origin.
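For two-dimensional positions the Procrustean transformation can be sketched without matrix routines. The code below is a special case written for illustration, not the general matrix formula above: it stores positions as a list of (x, y) pairs (the transpose of the k x n layout used in the text), centers both sets, and picks the better of the optimal rotation and the optimal rotation after a reflection. All names are ours.

```python
import math

def center(points):
    """Translate a set of 2-D points so that its centroid is at the origin."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    return [(p[0] - cx, p[1] - cy) for p in points]

def _best_rotation(target, points):
    """Rotation of `points` maximizing the sum of inner products with `target`."""
    a = sum(t[0] * p[0] + t[1] * p[1] for t, p in zip(target, points))
    b = sum(t[1] * p[0] - t[0] * p[1] for t, p in zip(target, points))
    theta = math.atan2(b, a)
    c, s = math.cos(theta), math.sin(theta)
    rotated = [(c * p[0] - s * p[1], s * p[0] + c * p[1]) for p in points]
    return math.hypot(a, b), rotated

def procrustes_2d(target, points):
    """Element of the configuration of `points` closest to the centered `target`."""
    target = center(target)
    points = center(points)
    flipped = [(p[0], -p[1]) for p in points]          # reflection about the x-axis
    score_rot, rot = _best_rotation(target, points)
    score_ref, ref = _best_rotation(target, flipped)
    # A larger inner-product score means a smaller sum of squared differences.
    return rot if score_rot >= score_ref else ref

# Hypothetical check: reflect, rotate, and shift a configuration, then align it back.
Z0 = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
c, s = math.cos(0.7), math.sin(0.7)
moved = [(c * x - s * (-y) + 5.0, s * x + c * (-y) - 3.0) for x, y in Z0]
aligned = procrustes_2d(Z0, moved)
```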

Given prior information on $\alpha$, $\beta$, and Z, our procedure for sampling from the posterior
distribution is as follows:

1. Identify an MLE $\hat{Z}$ of Z, centered at the origin, by direct maximization of the
likelihood.

2. Using $Z_0 = \hat{Z}$ as a starting value, construct a Markov chain over model parameters
as follows:

(a) Sample a proposal $\check{Z}$ from $J(Z \mid Z_k)$, a symmetric proposal distribution;

(b) Accept $\check{Z}$ as $Z_{k+1}$ with probability $\min\{1, [P(Y \mid \check{Z}, \alpha, \beta)\,\pi(\check{Z})] / [P(Y \mid Z_k, \alpha, \beta)\,\pi(Z_k)]\}$, where $\pi$ is the prior density of Z (the usual Metropolis ratio for a symmetric proposal); otherwise set $Z_{k+1} = Z_k$;

(c) Store $\tilde{Z}_{k+1} = \arg\min_{TZ_{k+1}} \operatorname{tr}(\hat{Z} - TZ_{k+1})'(\hat{Z} - TZ_{k+1})$.

3. Update $\alpha$ and $\beta$ with a Metropolis-Hastings algorithm.

Since each configuration can be represented by its unique Procrustean statistic, the posterior
distribution of the configuration around $\hat{Z}$ is represented by samples of $\tilde{Z}$ from the Markov
chain.
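The sampling steps above can be sketched as follows for the distance model without covariates. This is a minimal illustration under several assumptions of ours: Gaussian random-walk proposals with hypothetical step sizes, independent normal priors with standard deviation 100, the whole matrix of positions proposed at once, and an `align` hook standing in for the Procrustean step 2(c) (the identity by default). It is not tuned and is not the code used for the analyses in this paper.

```python
import math
import random

def log_lik(Z, alpha, Y):
    """Distance-model log-likelihood over ordered pairs, with eta_ij = alpha - |z_i - z_j|."""
    total = 0.0
    n = len(Z)
    for i in range(n):
        for j in range(n):
            if i != j:
                eta = alpha - math.dist(Z[i], Z[j])
                total += eta * Y[i][j] - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return total

def log_prior(Z, alpha, sd=100.0):
    """Independent N(0, sd^2) priors on alpha and all coordinates, up to a constant."""
    ss = alpha ** 2 + sum(c * c for z in Z for c in z)
    return -0.5 * ss / sd ** 2

def sample_posterior(Y, Z_start, alpha_start, n_scans,
                     step_z=0.2, step_a=0.2, align=lambda Z: Z, seed=0):
    rng = random.Random(seed)
    Z, alpha = [tuple(z) for z in Z_start], alpha_start
    current = log_lik(Z, alpha, Y) + log_prior(Z, alpha)
    draws = []
    for _ in range(n_scans):
        # Step 2(a): symmetric random-walk proposal for the positions.
        Z_prop = [tuple(c + rng.gauss(0.0, step_z) for c in z) for z in Z]
        cand = log_lik(Z_prop, alpha, Y) + log_prior(Z_prop, alpha)
        # Step 2(b): Metropolis acceptance.
        if rng.random() < math.exp(min(0.0, cand - current)):
            Z, current = Z_prop, cand
        # Step 3: random-walk Metropolis update of alpha.
        a_prop = alpha + rng.gauss(0.0, step_a)
        cand = log_lik(Z, a_prop, Y) + log_prior(Z, a_prop)
        if rng.random() < math.exp(min(0.0, cand - current)):
            alpha, current = a_prop, cand
        # Step 2(c): store the aligned positions (identity unless `align` is supplied).
        draws.append((align(Z), alpha))
    return draws

# Hypothetical three-actor example.
Y = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
Z_start = [(-0.8, 0.0), (0.0, 0.0), (0.8, 0.0)]
draws = sample_posterior(Y, Z_start, alpha_start=1.0, n_scans=200, seed=1)
```

In practice `align` would map each sampled Z to its Procrustean transformation toward the MLE, so that stored draws are comparable across scans.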

The computational details for the projection model are the same as above, except that
the likelihood is invariant under rotation and reflection of positions, but not translation.
Therefore, the only modification to the above is to let $\tilde{Z}_{k+1} = \arg\min_{TZ_{k+1}} \operatorname{tr}(\hat{Z} - TZ_{k+1})'(\hat{Z} - TZ_{k+1})$,
where T ranges over the set of rotations and reflections.

4  Examples 

We  analyze  three  standard  datasets  from  the  social  networks  literature:  Sampson’s  (1968) 
Monk  data,  Padgett  and  Ansell’s  (1993)  data  on  marriage  relations  between  Florentine 
families,  and  Hansell’s  (1984)  classroom  data. 

4.1  Monk  Data 

Sampson  (1968)  collected  data  on  a  variety  of  interpersonal  relations  among  18  monks.  Of 
particular  interest  has  been  the  data  on  positive  affect  relations,  in  which  each  monk  was 
asked  if  they  had  positive  relations  to  each  of  the  other  monks.  Based  on  the  network  and 
other data, Sampson originally classified each monk as belonging to one of four groups: the
Loyal Opposition (monks 2-6); the Young Turks (monks 8-14); the Outcasts (monks 16-18);
and the Waverers (monks 1, 7, 15). Subsequent data analyses have placed monks 1 and 7
with  the  Loyal  Opposition,  and  monk  15  with  the  Outcasts. 
These data are standard in the social network analysis literature, having been modeled
by Holland and Leinhardt (1981), Reitz (1982), Holland, Laskey and Leinhardt (1983), and
Fienberg, Meyer, and Wasserman (1981). Wang and Wong (1987) extended these models
by allowing for individual level variation in relations as well as group-level preferences for
ties, and obtained a substantially improved fit. Specifically, their stochastic blockmodel
modeled each pair $(y_{i,j}, y_{j,i})$ as depending on parameters for actor-specific rates of sending
and  receiving  ties,  a  parameter  representing  mutuality  of  ties,  and  a  parameter  representing 
the  preference  of  actors  to  send  ties  to  members  of  their  own  group.  Note  that  Wang  and 
Wong  took  the  group  membership  information  as  given,  even  though  it  was  derived  to  some 
extent  from  the  data. 

The relations between the monks are somewhat transitive: the number of non-vacuously
transitive ordered triples ($i \to j$, $j \to k$, $i \to k$) is 49. In 500 random reallocations of ties,
holding the number of ties sent by each actor constant, the largest number of non-vacuously
transitive triads was 35. The distance model we fit to the data takes advantage of this
transitivity,  and  achieves  a  better  fit  than  Wang  and  Wong’s  model,  using  fewer  parameters 
and  not  presuming  the  a  priori  existence  of  distinct  groups.  Our  model  is  the  distance  model 
presented  in  Section  2.1, 

$$P(Y \mid \alpha, Z) = \prod_{i \neq j} p(y_{i,j} \mid \alpha, z_i, z_j), \tag{5}$$

$$\operatorname{logit} p(y_{i,j} = 1 \mid \alpha, z_i, z_j) = \alpha - |z_i - z_j|,$$

where the $z_i$'s lie in $\mathbb{R}^2$. Note that the probability of the data depends only on the distances,
which are invariant under reflection, rotation, and location shift. As a result, three of the
18 x 2 = 36 position parameters can be fixed (one rotation angle and two translation coordinates),
so this model has 33 + 1 = 34 parameters (including $\alpha$).

The  distance  between  each  pair  of  nodes  was  first  calculated  as  the  average  of  the  two 
directed  path  lengths  between  each  pair.  Crude  estimates  of  latent  positions  were  then  found 
using multidimensional scaling, and the results were used as starting values for the nonlinear
minimizer optim in the R statistical programming environment. Random sampling of
starting  values  from  a  normal  distribution  produced  identical  results. 
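The starting dissimilarities described above (the average of the two directed path lengths between each pair) can be computed with a breadth-first search. The sketch below is illustrative; in particular, the treatment of unreachable pairs, which are assigned a path length of n, is our assumption and is not specified in the text. The resulting matrix would then be passed to a multidimensional scaling routine, which is not shown.

```python
from collections import deque

def directed_path_lengths(Y):
    """All-pairs shortest directed path lengths by BFS; None marks unreachable pairs."""
    n = len(Y)
    dist = [[None] * n for _ in range(n)]
    for source in range(n):
        dist[source][source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if Y[u][v] == 1 and dist[source][v] is None:
                    dist[source][v] = dist[source][u] + 1
                    queue.append(v)
    return dist

def starting_dissimilarities(Y):
    """Average of the two directed path lengths for each pair of actors."""
    n = len(Y)
    d = directed_path_lengths(Y)
    # Assumption: an unreachable pair is given length n, longer than any real path.
    filled = [[n if d[i][j] is None else d[i][j] for j in range(n)] for i in range(n)]
    return [[0.5 * (filled[i][j] + filled[j][i]) for j in range(n)] for i in range(n)]

# Hypothetical directed 3-cycle 0 -> 1 -> 2 -> 0.
Y = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
D = starting_dissimilarities(Y)
```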

As  shown  in  Table  1,  the  maximized  log-likelihood  is  -66.02  with  34  parameters,  compared 
to  the  maximized  log-likelihood  of  the  stochastic  blockmodel  fit  of  -82.12  with  37  parameters 
(Wang  and  Wong,  1987).  The  improvement  of  the  position-based  model  over  the  stochastic 
blockmodel  of  Wang  and  Wong  suggests  that,  since  relationships  are  indeed  transitive  to 
some  extent,  modeling  them  as  such  leads  to  an  improvement  in  model  fit.  The  maximum 
likelihood estimates of monk positions from the distance model are shown in panel (a)
of Figure 1.

Table 1: Model fitting results for the monk data

Model                   Maximized log-likelihood   # parameters
Distance model          -66.02                     34
Stochastic blockmodel   -82.12                     37

Figure 1: Maximum likelihood estimates (a) and Bayesian marginal posterior distributions
(b) for monk positions. The direction of a relation is indicated by an arrow.

The  conditional  independence  model  lends  itself  relatively  easily  to  a  Bayesian  analysis: 
priors can be formulated for $\alpha$ and Z, and posterior inference can be made about each. In
particular, this provides a means of making confidence regions for the positions of the actors
in social space. Using diffuse independent normal priors for $\alpha$ and Z, having means of zero
and standard deviations of 100, a Bayesian analysis was performed via 2.5 x 10^6 scans from
a Markov chain as described in Section 3. The chain mixes reasonably quickly in the $z_i$'s,
but quite slowly in $\alpha$, as shown in panel (b) of Figure 2. Output from the chain was saved
every  2  x  10^  scan,  and  positions  of  the  different  monks  are  plotted  for  each  saved  scan  in 
panel  (b)  of  Figure  1  (the  plotting  color  for  each  monk  is  based  on  their  mean  angle  from  the 
positive  x-axis  and  their  mean  distance  from  the  origin).  The  categorization  of  the  monks 
given  at  the  beginning  of  this  section  is  validated  by  the  distance  model  fitting,  as  there  is 
little  between-group  overlap  in  the  posterior  distribution  of  monk  positions.  Additionally, 
this model is able to quantify the extent to which some actors (such as monk 15) lie between
other groups of actors.

Figure 2: MCMC diagnostics for the monk analysis

4.2  Florentine  Families 

Padgett  and  Ansell  (1993)  compiled  data  on  marriage  and  business  relations  between  16 
historically  prominent  Florentine  families,  using  a  history  of  this  period  given  by  Kent  (1978). 
We  analyze  data  on  the  marriage  relations  taking  place  during  the  15th  century.  The  actors 
in  the  population  are  families,  and  a  tie  is  present  between  two  families  if  there  is  at  least 
one  marriage  between  them.  This  is  an  undirected  relation,  as  the  respective  families  of  the 
husband  and  wife  in  each  marriage  were  not  recorded.  One  of  the  sixteen  families  had  no 
marriage  ties  to  the  others,  and  was  consequently  dropped  from  the  analysis  (if  included, 
this  family  would  have  infinite  distance  from  the  others  in  a  maximum  likelihood  estimation, 
and  a  large  but  finite  distance  in  a  Bayesian  analysis,  as  determined  by  the  prior). 

Modeling $d_{i,j} = |z_i - z_j|$, $z_i, z_j \in \mathbb{R}^2$, and using the parametrization $\eta_{i,j} = \alpha(1 - d_{i,j})$ as
described in Section 2, the likelihood of $(\alpha, Z)$ can be made arbitrarily close to 1 as $\alpha \to \infty$
for fixed $Z = \hat{Z}$, i.e. the data are $d_2$-representable. Such a representing $\hat{Z}$ is plotted in panel
(a)  of  Figure  3.  Family  9  is  the  Medicis,  whose  average  distance  to  others  is  greater  only 
than  that  of  families  13  and  16,  the  Ridolfis  and  Tornabuonis.  Another  d2-representation 
is  given  in  the  panel  (b)  of  Figure  3.  This  configuration  is  similar  in  structure  to  the  first, 
except  that  the  segments  9-1  and  9-14-10  have  been  rotated.  This  is  somewhat  of  an  artifact 
of  our  choice  of  dimension:  when  modeled  in  three  dimensions,  1  and  14  are  fit  as  being 
relatively  equidistant  from  6. 

Figure 3: Panels (a) and (b) are alternate d2 representations of the Florentine family data.
Panel (c) gives marginal posterior distributions of family positions.

Figure 4: MCMC diagnostics for the Florentine family analysis


One  drawback  of  the  MLE’s  presented  above  is  that  they  overfit  the  data  in  a  sense,  as  the 
fitted probabilities of ties are all either 0 or 1 (or nearly so, for very large $\alpha$). Alternatively,
a prior for $\alpha$ can be formulated to keep predictive probabilities more in line with our beliefs;
for example, that the probability of a tie rarely goes below some small but not infinitesimal
value. Using the MCMC procedure outlined in Section 3, the marriage data were analyzed
using an exponential prior with mean 2 for $\alpha$ and diffuse independent normal priors for the
components of Z (mean 0, standard deviation 100). The MCMC algorithm was run for
5 x 10^6 scans, output being saved every 5000 scans. This chain mixes much faster than that
of  the  monk  example,  as  is  shown  in  the  diagnostic  plots  of  Figure  4.  Marginal  confidence 
regions  are  represented  by  plotting  samples  of  positions  from  the  Markov  chain,  shown  in 
panel  (c)  of  Figure  3.  Note  that  the  confidence  regions  include  both  the  configurations  given 
in  the  first  two  panels  of  Figure  3:  actors  14  and  10  (in  red  and  purple)  are  above  or  below 
actor  1  (in  green)  for  any  particular  sample;  the  observed  overlap  of  these  actors  in  the 
figure  is  due  to  the  bimodality  of  the  posterior  and  that  the  plot  gives  the  marginal  posterior 
distributions  of  each  actor. 

4.3  Classroom  Data 

Hansell’s (1984) data measure the existence of strong friendship ties between 13 boys and
14 girls in a sixth-grade classroom. Each student was asked whether they liked each of the
other students “a lot”, “some”, or “not much”. A strong friendship tie is considered present
if a student likes another student “a lot”.

The number of ties sent by each student varies considerably, ranging from zero to 19
with a mean of 5.8 and a standard deviation of 4.7 (the standard deviation of the number
of ties received was 3.2). For this reason, we choose to analyze the data using the projection
model described in Section 2.2, which allows for a variable rate in sending ties across
students. Additionally, 72% of the ties are same-sex, indicating that the friendship relation is
more prevalent within sex. Finally, the relations are transitive, in that the number of
non-vacuously transitive ordered triples is 400, compared to a maximum of 347 in 500 random
reallocations of ties, holding constant the number of ties sent by each student.
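
The transitivity comparison described above can be sketched as follows. This is a minimal illustration rather than the code used for the analysis: the function names and the 0/1 nested-list sociomatrix are assumptions, and "non-vacuously transitive" is taken here to mean an ordered triple (i, j, k) of distinct actors with ties i→j, j→k, and i→k all present.

```python
import random

def count_transitive_triples(adj):
    """Count ordered triples (i, j, k) of distinct actors with i->j, j->k and i->k."""
    n = len(adj)
    count = 0
    for i in range(n):
        for j in range(n):
            if i == j or not adj[i][j]:
                continue
            for k in range(n):
                if k != i and k != j and adj[j][k] and adj[i][k]:
                    count += 1
    return count

def reallocate_ties(adj, rng):
    """Redistribute each actor's outgoing ties uniformly at random,
    holding constant the number of ties sent by each actor."""
    n = len(adj)
    new_adj = [[0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        out_degree = sum(adj[i][j] for j in others)
        for j in rng.sample(others, out_degree):
            new_adj[i][j] = 1
    return new_adj

def max_transitive_under_reallocation(adj, n_sim=500, seed=0):
    """Largest transitive-triple count seen over n_sim random reallocations."""
    rng = random.Random(seed)
    return max(count_transitive_triples(reallocate_ties(adj, rng))
               for _ in range(n_sim))
```

Comparing `count_transitive_triples(adj)` for the observed sociomatrix with `max_transitive_under_reallocation(adj)` reproduces the style of comparison reported in the text (400 observed versus a maximum of 347), although the exact simulated maximum depends on the random draws.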

To  illustrate  the  features  of  the  projection  model,  we  fit  models  both  with  and  without 
covariate  information  on  the  sex  of  the  students,  that  is,  we  consider  both  of  the  following 
formulations: 

Projection model, no covariate: $\text{logit}(p_{ij}) = \alpha + z_i' z_j / |z_j|$.

Projection model, one covariate: $\text{logit}(p_{ij}) = \alpha + \beta x_{ij} + z_i' z_j / |z_j|$.
Table 2: Model fitting results for classroom data

Model                        Maximized log-likelihood    # parameters
Projection, with covariate   -224.58                     55
Stochastic blockmodel        -227.57                     55
Projection, no covariate     -229.05                     54

Figure 5: Maximum likelihood estimates of student positions, and posterior of $\beta$.

The covariate $x_{ij}$ is the indicator of actors $i$ and $j$ being of the same sex. We also compare
these models to the stochastic blockmodel fit of Wang and Wong (1987).
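
As a rough sketch of how the projection model's log-likelihood might be evaluated, the following assumes the log-odds of a tie from i to j is alpha + beta * x_ij + z_i'z_j / |z_j|, with ties conditionally independent given the latent vectors. The function names and the nested-list data layout are illustrative assumptions, not the authors' implementation; setting `beta = 0` gives the model without the covariate.

```python
import math

def projection_logit(alpha, beta, x_ij, z_i, z_j):
    """Log-odds of a tie i->j: alpha + beta * x_ij + z_i'z_j / |z_j|."""
    dot = sum(a * b for a, b in zip(z_i, z_j))
    norm_j = math.sqrt(sum(b * b for b in z_j))
    return alpha + beta * x_ij + dot / norm_j

def log1pexp(eta):
    """Numerically stable log(1 + exp(eta))."""
    return max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))

def log_likelihood(y, x, z, alpha, beta=0.0):
    """Sum of Bernoulli log-likelihood terms over all ordered pairs i != j."""
    n = len(y)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            eta = projection_logit(alpha, beta, x[i][j], z[i], z[j])
            # y*eta - log(1 + e^eta) equals log p if y = 1 and log(1 - p) if y = 0
            total += y[i][j] * eta - log1pexp(eta)
    return total
```

A function of this form could be handed to a general-purpose optimizer to obtain maximum likelihood estimates, with the starting values chosen as described in the text.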

Distance estimates for both models were first obtained by calculating the average of the
directed path lengths between each pair. Crude positions in a single dimension were found
using Sammon’s (1969) non-linear mapping. These positions were converted into positions
on a circle, which became the starting values of the latent vectors in the optimization routine.
Randomly sampled starting values gave the same optimum fit, given in Table 2. The projection
model with sex as a covariate gives the best fit, with the coefficient $\beta$ being nominally
significant based on a likelihood ratio test.
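
The likelihood ratio comparison can be reproduced from the maximized log-likelihoods in Table 2. The sketch below assumes the usual chi-square reference distribution with one degree of freedom (the text does not state which reference distribution was used, and the asymptotics are uncertain when the number of parameters grows with the number of actors, hence "nominally"). The function name is hypothetical.

```python
import math

def lrt_one_df(loglik_full, loglik_reduced):
    """Likelihood ratio statistic and its chi-square(1) tail probability."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    # For one degree of freedom, P(chi2 > s) = erfc(sqrt(s / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Maximized log-likelihoods from Table 2: with and without the sex covariate
stat, p_value = lrt_one_df(-224.58, -229.05)
print(f"LR statistic = {stat:.2f}, nominal p-value = {p_value:.4f}")
```

The statistic is 2 × (229.05 − 224.58) = 8.94, which under this assumed reference distribution corresponds to a nominal p-value well below 0.05.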

Fitting the model without the covariate information on sex gives the estimates of positions
shown in panel (a) of Figure 5. Here the students are plotted along the circumference of a
circle according to the angle of their latent vector, and the size of the plotting character for
a student is increasing in the magnitude of their vector. The model identifies two somewhat
orthogonal groups of actors, falling on vectors emanating from the origin, one consisting
of mostly boys (□), and the other girls (o) (the difference between boys’ and girls’ median
angles, plotted in dashed lines, is 76 degrees).

Note that if the sexes were separated by 180 degrees, then based on the model it would
be improbable for actors to have ties to both boys and girls, which is not altogether
uncommon in the data. By having the group vectors separated by 76 degrees, the
model predicts ties between the sexes as being rare, although it allows for a non-negligible
probability of some actors sending ties to both groups, or even sending ties primarily to
members of the opposite group.
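
The geometric argument can be made explicit with a short calculation. As an idealization (an assumption made only for illustration), suppose the sender $i$ and receiver $j$ lie exactly on the two median group directions, and write $\theta_{ij}$ for the angle between their latent vectors:

```latex
\frac{z_i' z_j}{|z_j|} = |z_i| \cos\theta_{ij},
\qquad
\theta_{ij} = 0^\circ:\ |z_i|, \qquad
\theta_{ij} = 76^\circ:\ \approx 0.24\,|z_i|, \qquad
\theta_{ij} = 180^\circ:\ -|z_i| .
```

Under this idealization, a separation of 76 degrees reduces the latent contribution to the log-odds of a between-sex tie to about a quarter of its within-group value while keeping it positive, whereas a separation of 180 degrees would make it as negative as the within-group contribution is positive.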

A further application of the projection model is as a means of identifying boys and girls
who may be in similar social groups, after having accounted for the fact that the frequency of
between-sex friendship ties is low. The estimated positions after having partially accounted
for this known covariate structure are shown in panel (b) of Figure 5. Note there is still
considerable separation of the sexes, although the difference in median angles has been
reduced to 60 degrees. This suggests that the single covariate $x_{ij}$ does not fully explain the
different rates of within- and between-sex friendship ties. A “full” model would have different
baseline rates for the four different types of ties (boy→boy, boy→girl, girl→girl, girl→boy).
Indeed, inclusion of these parameters reduces the median angle between the sexes to 13
degrees. We present only the model with the single covariate, as this data analysis is meant
primarily as an illustrative example.

The above model could also be used as a means of making inference on the preference
for within-sex friendship ties. A naive approach to inference would be to treat each possible
tie as a Bernoulli random variable, independent of the other ties. Using logistic regression,
we would estimate the log-odds ratio of a within-sex pair being friends compared to that
of a between-sex pair as 1.3, with a standard error of 0.2. Of course, we would expect a
confidence interval based on such an analysis to be too small, as ties between individuals
are not independent, unconditional on the latent positions. As an alternative, a Bayesian
analysis was performed as outlined in Section 3. A Markov chain of length $5 \times 10^6$ scans was
constructed, starting at the MLE. Output was saved every 1000 scans, which was then used
to make marginal posterior inference on $\beta$. The marginal posterior density of $\beta$ is given in
panel (c) of Figure 5, in which the solid vertical line represents the MLE from the projection
model, and the dashed lines represent the MLE plus and minus two standard errors, based
on an ordinary logistic regression. As we expect, a 95% confidence region from the Bayesian
analysis would be wider than the one based on the ordinary logistic regression.
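
For a single binary covariate, the naive logistic regression estimate is the log-odds ratio of the corresponding 2 × 2 table, so the figures quoted above can be checked approximately by hand. The sketch below is illustrative only: the raw tie counts are not given in the text, so the counts used here are an assumption reconstructed from the reported summaries (27 students sending a mean of 5.8 ties, 72% of them same-sex). The function name is hypothetical.

```python
import math

def naive_log_odds_ratio(ties_within, pairs_within, ties_between, pairs_between):
    """Log-odds ratio (within-sex vs. between-sex) and its usual standard error,
    treating every ordered pair as an independent Bernoulli trial."""
    a, b = ties_within, pairs_within - ties_within
    c, d = ties_between, pairs_between - ties_between
    estimate = math.log((a * d) / (b * c))
    std_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return estimate, std_error

n_boys, n_girls = 13, 14
pairs_within = n_boys * (n_boys - 1) + n_girls * (n_girls - 1)  # ordered same-sex pairs
pairs_between = 2 * n_boys * n_girls                            # ordered mixed-sex pairs

# ASSUMPTION: approximate counts, about 27 * 5.8 = 157 ties with 72% same-sex
ties_within, ties_between = 113, 44

estimate, std_error = naive_log_odds_ratio(
    ties_within, pairs_within, ties_between, pairs_between)
print(f"log-odds ratio = {estimate:.2f}, standard error = {std_error:.2f}")
```

With these approximate counts the result is close to the reported estimate of 1.3 with standard error 0.2. The standard error formula relies on the independence assumption that the text identifies as inappropriate for network data.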

5  Discussion 

This article proposes a new model for social networks based on spatial representation, for
which maximum likelihood and Bayesian inference are practical to implement. The approach
has some advantages over existing social network models and inferential procedures. First,
the proposed method provides a visual, interpretable model-based spatial representation of
network relationships. Second, it improves on existing methods by allowing the statistical
uncertainty in the social space to be quantified and graphically represented. Third, it is
flexible and can be easily generalized to allow for multiple relationships, ties with varying
strengths (using generalized linear models), and time-varying relations (by modeling the
latent positions as stochastic processes). Fourth, it deals easily with missing data, at least if
information on ties is missing at random: the likelihood includes only terms corresponding to
observed ties. Finally, the model is inherently transitive, and so we can expect an improved
fit over models lacking such structure (such as the stochastic blockmodel) when the relations
are transitive in nature.
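
The treatment of missing data mentioned above amounts to skipping unobserved pairs when summing the log-likelihood. A minimal sketch, assuming a matrix `eta` of log-odds already computed from whichever latent position model is in use and a hypothetical boolean matrix `observed` marking which ties were recorded:

```python
import math

def observed_log_likelihood(y, eta, observed):
    """Bernoulli log-likelihood summed only over ordered pairs whose tie
    status was actually observed (valid when ties are missing at random)."""
    n = len(y)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j or not observed[i][j]:
                continue  # unobserved pairs contribute no term
            e = eta[i][j]
            # stable evaluation of y*e - log(1 + exp(e))
            total += y[i][j] * e - (max(e, 0.0) + math.log1p(math.exp(-abs(e))))
    return total
```

Entries of `y` at unobserved positions are never read, so they can hold any placeholder value.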

The choice of a prior distribution for latent positions was not discussed at length in this
paper. Although simple, the diffuse independent normal priors presented in the examples
may not accurately represent prior beliefs about the structure of social networks. More
appropriate might be clustered point processes or mixtures of normals with an unknown number
of components. This would add another level of hierarchy to the analysis, although the
resulting model would be more flexible and perhaps more accurately represent any tendencies
of populations to form segregating groups.

As an alternative to the models presented in this article, multidimensional scaling
(MDS) is widely used as a means of representing the spatial structure of a social network
(Breiger, Boorman and Arabie 1975; Faust and Romney 1985). In this context, MDS is a
class  of  methods  that  can  be  used  to  produce  a  spatial  representation  of  individuals  based 
on  similarity  or  dissimilarity  measures  between  pairs  of  individuals.  Such  applications  of 
MDS  differ  from  the  models  presented  here  in  that  MDS  is  used  primarily  as  a  data-analytic 
means  of  visualizing  given  dissimilarities  while  this  method  is  a  model-based  representation 
of  the  measured  relations  and  latent  positions  (although  recently  DeSarbo,  Kim,  and  Fong 
(1999)  and  Oh  and  Raftery  (2001)  have  developed  model-based  MDS  applicable  to  two-mode 
networks  within  a  Bayesian  framework).  Our  model  has  a  number  of  advantages  over  MDS. 
First,  our  method  directly  models  the  response,  while  the  usual  choices  for  dissimilarities 
in  MDS  are  ad  hoc  and  do  not  reflect  the  stochastic  nature  of  the  sociomatrix.  Second, 
current  versions  of  MDS  use  maximum  likelihood  or  other  optimization  methods  over  large 
numbers  of  parameters  (e.g.,  linear  in  the  number  of  individuals).  The  asymptotic  properties 
of  these  methods  are  largely  unknown,  and  the  uncertainty  in  the  latent  positions  is  difficult 
to quantify. To avoid this, some versions of MDS assume that individuals can be grouped
into homogeneous clusters, so-called latent class MDS (Lazarsfeld and Henry 1968; DeSarbo
et al. 1994). However, individual-specific variability in relative position is often the primary
focus in the social network context, something which can be quantified in an interpretable
way via a Bayesian analysis of one of the position-based models discussed in this article.

R code for implementing the proposed methods will be available through the first author’s
website: www.stat.washington.edu/hoff.